Multiple linear combination (MLC) regression tests for common variants adapted to linkage disequilibrium structure

Authors

  • Yun Joo Yoo
  • Lei Sun
  • Julia G. Poirier
  • Andrew D. Paterson
  • Shelley B. Bull
Abstract

By jointly analyzing multiple variants within a gene, instead of one at a time, gene-based multiple regression can improve power, robustness, and interpretation in genetic association analysis. We investigate multiple linear combination (MLC) test statistics for analysis of common variants under realistic trait models with linkage disequilibrium (LD) based on HapMap Asian haplotypes. MLC is a directional test that exploits LD structure in a gene to construct clusters of closely correlated variants recoded such that the majority of pairwise correlations are positive. It combines variant effects within the same cluster linearly, and aggregates cluster-specific effects in a quadratic sum of squares and cross-products, producing a test statistic with reduced degrees of freedom (df) equal to the number of clusters. By simulation studies of 1000 genes from across the genome, we demonstrate that MLC is a well-powered and robust choice among existing methods across a broad range of gene structures. Compared to minimum P-value, variance-component, and principal-component methods, the mean power of MLC is never much lower than that of other methods, and can be higher, particularly with multiple causal variants. Moreover, the variation in gene-specific MLC test size and power across 1000 genes is less than that of other methods, suggesting it is a complementary approach for discovery in genome-wide analysis. The cluster construction of the MLC test statistics helps reveal within-gene LD structure, allowing interpretation of clustered variants as haplotypic effects, while multiple regression helps to distinguish direct and indirect associations.
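The abstract describes the MLC statistic only in words, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes the multiple regression has already been fitted, giving an estimated effect vector (beta) with covariance matrix (sigma), that variants have already been recoded and assigned to clusters, and that the statistic takes the quadratic form W = b' S^-1 J (J' S^-1 J)^-1 J' S^-1 b, where J is the variant-by-cluster indicator matrix. The function names `solve` and `mlc_statistic` and this exact weighting are illustrative assumptions; the abstract confirms only that effects are combined linearly within clusters, aggregated quadratically across clusters, and tested on df equal to the number of clusters.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (hypothetical helper, standard library only)."""
    n = len(a)
    # Work on an augmented copy so the inputs are not modified.
    m = [[float(x) for x in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mlc_statistic(beta, sigma, clusters):
    """Return (statistic, df) for an assumed MLC-style quadratic form.

    beta     : estimated per-variant effects from a multiple regression
    sigma    : covariance matrix of beta
    clusters : cluster label for each variant (same length as beta)
    """
    labels = sorted(set(clusters))
    n_var = len(beta)
    # Indicator columns of J: one column per cluster.
    j_cols = [[1.0 if c == lab else 0.0 for c in clusters] for lab in labels]
    # u = Sigma^{-1} beta and v_k = Sigma^{-1} J_k.
    u = solve(sigma, beta)
    v_cols = [solve(sigma, col) for col in j_cols]
    # t = J' Sigma^{-1} beta: one linear combination of effects per cluster.
    t = [sum(jc[i] * u[i] for i in range(n_var)) for jc in j_cols]
    # a = J' Sigma^{-1} J: covariance of t under the null hypothesis.
    a = [[sum(jk[i] * vl[i] for i in range(n_var)) for vl in v_cols]
         for jk in j_cols]
    # Quadratic form t' a^{-1} t aggregates the cluster-specific effects.
    w = solve(a, t)
    stat = sum(ti * wi for ti, wi in zip(t, w))
    return stat, len(labels)


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(mlc_statistic([1.0, 1.0, 2.0], identity, [0, 0, 1]))
```

Under this assumed form, putting every variant in its own cluster reduces the statistic to the ordinary Wald statistic on as many df as there are variants, while a single cluster gives a 1-df linear combination test. Effects of opposite sign within a cluster cancel in the linear combination, which is consistent with the abstract's emphasis on recoding variants so that most pairwise correlations are positive.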


Related resources

Multi-bin multi-variant tests for gene-based linear regression analysis of genetic association

By jointly analyzing multiple variants within a gene, instead of one at a time, gene-based regression analysis can improve power and robustness of genetic association analysis. Extending prior work that examined multi-bin linear combination (MLC) statistics for combined analysis of rare and common variants, here we investigate analysis of common variants more extensively under realisti...

Gene-based multiple regression association testing for combined examination of common and low frequency variants in quantitative trait analysis

Multi-marker methods for genetic association analysis can be performed for common and low frequency SNPs to improve power. Regression models are an intuitive way to formulate multi-marker tests. In previous studies we evaluated regression-based multi-marker tests for common SNPs, and through identification of bins consisting of correlated SNPs, developed a multi-bin linear combination (MLC) tes...

Typing for MLC (LD). I. Lymphocytes from cousin-marriage offspring as typing cells.

When it was observed that the majority of SD-identical unrelated individuals stimulated each other in the MLC test, it was postulated, in agreement with earlier suggestions by others, that MLC activation might be coded by a locus separate from the LA and Four loci. The existence of such an MLC locus is now generally accepted. About 10% of the MLC tests between SD-identical unrelated persons are n...

Association mapping of agronomic traits in oriental tobaccos (Nicotiana tabacum L.)

Tobacco (Nicotiana tabacum L.) is one of the most valuable agricultural and industrial crops. Studying the most important traits of tobacco is difficult because of their quantitative nature: they are controlled by multiple genes and affected by environmental factors. Among the various methods for studying quantitative traits, association mapping, which utilizes phenotypic and DNA marker information, is one of the ef...

Mendelian Randomization Analysis With Multiple Genetic Variants Using Summarized Data

Genome-wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk fa...

Journal title:

Volume 41  Issue

Pages  -

Publication date 2017